Phylogenomic approaches have shown 
that eukaryotes acquire genes via 
gene transfer. However, there are two 
fundamental problems for most of these 
analyses: only transfers from prokaryotes 
are analyzed, and the screening procedures 
applied assume that gene transfer is rare 
for eukaryotes. Directed studies of the 
impact of gene transfer on diverse 
eukaryotic lineages produce a much more 
complex picture. Many gene families are 
affected by multiple transfer events from 
prokaryotes to eukaryotes, and transfers 
between eukaryotic lineages are routinely 
detected. This suggests that the assump- 
tions applied in traditional phylogenomic 
approaches are too naive and result in 
many false negatives. This issue was 
recently addressed by identifying and 
analyzing the evolutionary history of 
49 patchily distributed proteins shared 
between Dictyostelium and bacteria. The 
vast majority of these gene families 
showed strong indications of gene trans- 
fers, both between and within the three 
domains of life. However, only one of 
these was previously reported as a gene 
transfer candidate using a traditional 
phylogenomic approach. This clearly 
illustrates that more realistic assumptions 
are urgently needed in genome-wide 
studies of eukaryotic gene transfer. 



Transfer of genetic material between 
different organismal lineages is important 
in prokaryote evolution. Studies of single 
gene families as well as phylogenomic 
studies in the last decade have shown 
that eukaryotes are also affected by this 
evolutionary mechanism. ^'^ However, the 
importance of the process is still uncertain; 
only modest numbers of gene transfer 
candidates are typically reported from 
eukaryotic genome projects,^'^ whereas 
directed studies suggest gene transfer to 
be important in the adaptation process of 
eukaryotes. This is an intriguing incon- 
gruity. To understand these differences 
it is useful to consider the assumptions 
applied in the screening procedures in the 
phylogenomic approaches used in genome 
projects (referred to as 'traditional phylo- 
genomic approaches' herein) and how well 
they match the knowledge we currently 
have from more directed studies of 
eukaryote gene transfer. Here I argue 
that the match is poor, leading to a 
high number of false negatives. 

Studies of gene transfers in eukaryotes 
using phylogenomic methods typically 
identify eukaryotic proteins with high 
sequence similarity to a prokaryotic pro- 
tein, but with no or significantly weaker 
similarity to any eukaryotic protein.^'^ 
This is indeed a strong indication of a 
gene transfer event. The problem is that 
these traditional phylogenomic approaches 
only identify protein families in which a 
single transfer has occurred between a 
prokaryote and a eukaryote, which is 
probably a rarity. Comparative genomics 
studies of prokaryotes have shown that 
most protein families are patchily distri- 
buted; they are absent from a small or large 
fraction of the genomes (Fig. 1).^ These 
genes are distributed via gene transfer and 
provide diversifying functions and niche 
adaptation to the recipients. Eukaryote 
genome evolution, on the other hand, has 
been viewed as mainly influenced by 
genome expansion and a few major endo- 
symbiotic events. However, there are data 
suggesting that gene transfer of patchily 
distributed proteins is also important for 
the diversification of eukaryotes. ^'^ 

Figure 1. Patchily distributed proteins are distributed via gene transfer. (A) Comparative genomics of prokaryotes has identified three loosely defined 
groups of gene families based on their frequencies in genomes: extended core, character and accessory genes.^ Core genes encode functions shared 
between organisms, character genes encode functions that distinguish major groups, and accessory genes encode functions unique to a few organisms. The evolutionary 
mode differs between the groups. Core genes are vertically inherited and used for organismal phylogenies, whereas character and accessory genes 
are probably more influenced by gene transfer; none of the groups should be viewed as representative of the evolution of the whole genome. 
The evolution of the accessory and character genes was studied by identifying 49 patchily distributed protein families present in the cellular slime mold 
Dictyostelium discoideum and bacteria.^ (B) The maximum likelihood phylogeny of a conserved hypothetical protein identified in the study. Eukaryotes 
are shown in color and prokaryotes in black. Distantly related eukaryotes are found intermixed with prokaryotic sequences, suggestive of multiple 
transfer events.^ The figure is adapted from references 8 and 9. 



Here I will review some recent findings 
of adaptation by gene acquisition in 
eukaryotes obtained using 
phylogenomic approaches for the study 
of gene transfers between specific eukaryo- 
tic groups,^ '' and one attempt to use a 
novel approach to study patchily distri- 
buted proteins.^ With an increased under- 
standing of the evolutionary dynamics of 
all classes of gene families (Fig. 1) we can 
apply realistic assumptions to large-scale 
studies of eukaryotic gene transfer. 

Directed Studies Identify Gene 
Sharing Leading to Adaptation 

Many of the most devastating diseases in 
plants are caused by fungi or oomycetes. 
These are two distantly related groups of 
eukaryotes that have similar lifestyles. 
Both feed by osmotrophy. The cells secrete 
enzymes that decompose organic matter 
and the metabolites are imported into the 
cell. The similarity in lifestyle between the 
groups is an example of convergent 
evolution. Fungi are more closely related 
to animals than to oomycetes, whereas 
diatoms, a group of photosynthetic algae, 
are a sister group to oomycetes. Absence 
of phagotrophy has been assumed to be 
a barrier to gene transfer. Indeed, the 
oomycete and fungal genomes are not 
among the genomes for which traditional 
phylogenomic studies have indicated a 
significant role of gene transfer in eukar- 
yotes. Nevertheless, targeted evolutionary 
studies have suggested that gene transfer 
contributed to the similarities between the 
groups.^^'^^ 

Richards and coworkers studied this 
phenomenon further.^ They identified 
dozens of gene transfers between the 
groups using a wide range of genomes 
from both groups together with clustering 
and phylogenetic methods. Interestingly, 
all transfers except one were reported 
to have occurred in the direction from 
fungi to oomycetes. Many of the trans- 
ferred genes encode secreted decomposing 
enzymes and were specifically acquired by 
plant-tissue-colonizing oomycetes. These 
results show that oomycetes are most likely 
more recent plant pathogens than fungi 
and that transfer of genetic material from a 
distantly related eukaryotic group has 
played an important role in the evolution of 
their pathogenic lifestyle.'' These fascin- 
ating results would not have been obtained 
with a traditional phylogenomic approach 
in which genes with strong sequence 
similarities to other eukaryotes would 
have been assumed to be present in the 
common eukaryotic ancestor. 

Studies of gene transfer are indeed able 
to shed light on the diversification process 
of eukaryotes. Animals and fungi are both 
members of Opisthokonta. No photo- 
synthetic member has been identified in 
this group. Choanoflagellates are a group 
of free-living microbial eukaryotes that 
are the closest relatives of animals. Sun and 
coworkers used a directed phylogenomic 
approach to search for genes of algal origin 
in the genome of Monosiga brevicollis, a 
phagotrophic unicellular choanoflagellate.'^ 
They reconstructed phylogenetic trees for 
all genes in the genome. Using realistic 
filtering criteria they were able to identify 
103 genes with strong support for algal 
origin, mostly from haptophytes, diatoms 
and green algae. This could be the result 
of repeated transfer of genes from food; 
choanoflagellates feed on bacteria and 
other eukaryotes. Alternatively, or rather 
in addition, the genes could have been 
introduced from a past algal endosymbiont 
in the lineage leading to Monosiga.'^ 
Interestingly, a quarter of the identified 
genes appeared to first have been trans- 
ferred from bacteria to a eukaryotic alga, 
and then secondarily to choanoflagellates.'^ 
However, such a bacterial origin would 
not have been detected using a typical 
phylogenomic approach since the strongest 
sequence similarity would be to an algal gene 
of bacterial origin. Functions in amino 
acid and carbohydrate metabolism domi- 
nated among the gene transfer candidates, 
indicating that these choanoflagellates have 
adapted by acquisition of algal genes that 
expand their metabolic repertoire. 

A Novel Approach to Study 
Patchily Distributed Proteins 

The two examples outlined above test well- 
defined hypotheses about gene transfers 
by using directed phylogenomic methods 
in combination with careful filtering 
and interpretation of the results. Such 
methods are very powerful for characterizing 
the role of gene transfers between distantly 
related eukaryotic groups in the adaptation 
of eukaryotes. However, one disadvantage 
is that the approach relies on existing 
knowledge of the biology of the organisms; 
if only transfers between two organismal 
groups are addressed, important contribu- 
tions from other groups may be missed in 
such phylogenomic approaches. In addi- 
tion, these kinds of studies can only 
estimate a minimal number of transfers 
between the groups analyzed; the number 
of false negatives may be large. To 
circumvent these problems I applied an 
alternative approach to study these issues. 
I first identified patchily distributed 
proteins because these are expected to 
be enriched with gene transfer events 
(Fig. 1), instead of screening for unexpec- 
ted sequence similarities. Then I per- 
formed phylogenetic analyses for each 
identified gene family to evaluate whether 
the patchy distribution was a consequence 
of gene transfer, or differential loss in the 
eukaryotic domain.^ 
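
The first step described above, selecting patchily distributed families before any tree is built, can be sketched roughly as follows. This is a minimal illustration only: the data layout, the function name `is_patchy`, the toy genomes and the two cutoff fractions are assumptions made for the example, not the criteria used in the study.

```python
from typing import Dict, Set


def is_patchy(members: Set[str], focal: str,
              eukaryotes: Set[str], prokaryotes: Set[str],
              max_euk_fraction: float = 0.5,
              max_prok_fraction: float = 0.5) -> bool:
    """Return True if a family is present in the focal genome and in at
    least one prokaryote, but absent from most other sampled genomes.

    The cutoff fractions are hypothetical values chosen for illustration.
    """
    if focal not in members:
        return False
    prok_hits = members & prokaryotes
    if not prok_hits:
        return False  # not shared with any prokaryote
    other_euks = eukaryotes - {focal}
    euk_hits = members & other_euks
    euk_fraction = len(euk_hits) / len(other_euks) if other_euks else 0.0
    prok_fraction = len(prok_hits) / len(prokaryotes)
    return (euk_fraction <= max_euk_fraction
            and prok_fraction <= max_prok_fraction)


# Toy presence/absence data (invented for the example).
EUKS = {"Dictyostelium", "Naegleria", "Entamoeba", "Homo", "Saccharomyces"}
PROKS = {"Bacillus", "Escherichia", "Streptomyces", "Synechocystis"}
families: Dict[str, Set[str]] = {
    "famA": {"Dictyostelium", "Naegleria", "Bacillus"},  # patchy
    "famB": EUKS | PROKS,                                # universal core
    "famC": {"Dictyostelium", "Homo"},                   # eukaryote-only
}

patchy = sorted(name for name, members in families.items()
                if is_patchy(members, "Dictyostelium", EUKS, PROKS))
print(patchy)
```

Note that the filter uses distribution only; it makes no assumption about which sequences are most similar, which is the point of contrast with similarity-based screens. Each selected family still has to be examined phylogenetically.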

The soil-dwelling cellular slime mold 
D. discoideum was selected in the case 
study for two reasons: an active research 
community has produced a high quality 
annotation of the genome sequence 
(http://dictybase.org/), and only 18 poten- 
tial gene transfers were reported in the 
original publication.^ I identified 49 
protein families in the Dictyostelium 
genome which were shared with at least 
one prokaryotic species, but only a 
limited number of other eukaryotes and 
prokaryotes (Fig. 1). The evolutionary 
history of these patchily distributed 
families was analyzed further.^ For seven 
of the families there were no eukaryotic 
sequences except the Dictyostelium 
sequences. The remaining 42 families 
contained sequences from one or more 
eukaryotic species outside the Dictyo- 
stelium genus. The closest relative with a 
completely sequenced genome, the human 
parasite E. histolytica,^ was represented 
in only two families. In contrast, the 
amoeboflagellate Naegleria gruberi had a 
representative in 25 of the families. 
Dictyostelium and Naegleria have 
somewhat overlapping lifestyles: they are 
both free-living heterotrophs that can be 
found in soil and they both undergo cell 
differentiation under certain conditions. 
However, they are distantly related 
eukaryotes classified within two different 
supergroups: Amoebozoa and Excavata. 

There are at least two plausible 
explanations for this striking 
gene-sharing pattern. First, the genes may 
have been present in the common ancestor 
of Dictyostelium and Naegleria and distri- 
buted in eukaryotes strictly by vertical 
inheritance.^ In lineages that have different 
lifestyles (i.e., parasites) the genes would 
have become obsolete and been lost over 
evolutionary time. Alternatively, the genes 
may have been distributed via gene transfer 
on more recent evolutionary timescales, 
providing a selective advantage to the 
recipient lineages. Phylogenetic analyses 
were performed on all protein families to 
distinguish between these alternatives. The 
results were striking. The vast majority 
of the phylogenetic trees showed strong 
indications of lateral gene transfer between 
prokaryotes and eukaryotes and within 
eukaryotes.^ Figure 1B shows an example 
of an individual gene tree. The exact 
details of the transfer events could in 
many cases not be traced, because the 
density of organismal sampling was too 
low. Nevertheless, there are no strong 
indications that any of the proteins have 
evolved solely via vertical inheritance 
and gene loss; gene transfer has likely 
affected all patchily distributed gene 
families identified in the analysis to some 
extent.^ 
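
One simple signal in such gene trees is whether the eukaryotic sequences form a single clade or are intermixed with prokaryotes. The sketch below checks this on a toy rooted tree written as nested tuples. It is a deliberate simplification and an assumption of this example: real analyses use maximum likelihood trees, take branch support and rooting uncertainty into account, and the helper names and toy topologies here are invented.

```python
def leaves(tree):
    """Return the set of leaf names under a node (a leaf is a string)."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def clades(tree):
    """Yield the leaf set of every node in the rooted tree."""
    yield frozenset(leaves(tree))
    if not isinstance(tree, str):
        for child in tree:
            yield from clades(child)


def eukaryotes_form_one_clade(tree, eukaryotes):
    """True if all eukaryotic leaves group together to the exclusion of
    every prokaryotic leaf, as expected under strict vertical inheritance."""
    euk_leaves = leaves(tree) & eukaryotes
    return any(clade == euk_leaves for clade in clades(tree))


EUKS = {"Dictyostelium", "Naegleria", "Entamoeba"}

# Toy topology compatible with vertical inheritance plus loss.
vertical = ((("Dictyostelium", "Entamoeba"), "Naegleria"),
            ("Bacillus", "Escherichia"))

# Toy topology with eukaryotes intermixed with prokaryotes.
intermixed = (("Dictyostelium", "Bacillus"),
              ("Naegleria", "Escherichia"))

print(eukaryotes_form_one_clade(vertical, EUKS))
print(eukaryotes_form_one_clade(intermixed, EUKS))
```

An intermixed topology is only suggestive of transfer; with sparse organismal sampling the direction and number of events usually cannot be read from the tree alone.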

Traditional Phylogenomic Studies 
Have Drastically Underestimated 
the Amount of Gene Transfer 

Only a single protein among the 49 
identified as patchily distributed was 
among the 18 gene transfer candidates 
in the original D. discoideum genome 
publication,^ and very few were among 
the 184 lateral gene transfer candidates 
reported from N. gruberi.'^ This may be 
surprising, but is logical if the details of 
the methods applied are considered. 
Dictyostelium genes with significant 
similarity to a bacterial-specific Pfam 
domain and only present in Dictyo- 
stelium among eukaryotes were considered 
as gene transfer candidates.^ This conser- 
vative approach is unlikely to pick up false 
positives, but will be very prone to false 
negatives. Genes acquired via gene transfer 
in two or more different eukaryotes are 
excluded, as are any genes without 
sufficient sampling among prokaryotes 
to be included in Pfam. Similarly, the 
N. gruberi gene set was screened with 
similarity searches, and genes with signi- 
ficant similarity only to prokaryotes were 
considered as gene transfer candidates.^ 
Again, gene families with repeated 
transfers are missed in the screen and 
eukaryote-to-eukaryote transfers are not 
even considered. The true number of gene 
families affected by gene transfer in these 
microbial eukaryotes is likely much larger 
than has been reported. 
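
The logic of such a similarity-based screen, and why it produces false negatives, can be illustrated with a small sketch. The hit format, the function name `traditional_candidate` and both cutoffs are hypothetical; the actual screens used Pfam domains and similarity searches with their own criteria.

```python
from typing import List, Tuple

# (organism, domain of life, similarity score); a made-up hit format.
Hit = Tuple[str, str, float]


def traditional_candidate(hits: List[Hit], focal_genus: str,
                          min_score: float = 100.0,
                          min_ratio: float = 2.0) -> bool:
    """Flag a gene as a transfer candidate only if its best prokaryotic
    hit is strong and clearly better than any hit in other eukaryotes.

    Both cutoffs are hypothetical values chosen for illustration.
    """
    prok = [score for org, dom, score in hits if dom == "prokaryote"]
    euk = [score for org, dom, score in hits
           if dom == "eukaryote" and org != focal_genus]
    best_prok = max(prok, default=0.0)
    best_euk = max(euk, default=0.0)
    return best_prok >= min_score and best_prok >= min_ratio * best_euk


# Gene found only in Dictyostelium and bacteria: flagged.
single_transfer = [("Bacillus", "prokaryote", 300.0)]

# Gene also present in Naegleria, for example after an independent
# acquisition: the strong eukaryotic hit hides it from the screen.
shared_with_naegleria = [("Bacillus", "prokaryote", 300.0),
                         ("Naegleria", "eukaryote", 280.0)]

print(traditional_candidate(single_transfer, "Dictyostelium"))
print(traditional_candidate(shared_with_naegleria, "Dictyostelium"))
```

The second gene is rejected purely because another eukaryote carries a similar sequence, regardless of how that eukaryote obtained it.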

These discrepancies should not be 
surprising from a biological viewpoint. 
Microbes live in steadily changing environ- 
ments. Ecosystems are inhabited by dis- 
tantly related organisms that have 
adapted to their specific conditions. The 
spread of patchily distributed genes is 
part of this adaptation process, and there 
is no reason to assume that microbial 
eukaryotes do not take part in this flux 
of genetic material (Fig. 1). For 
example, if a gene provides the ability to 
utilize a carbon compound present in the 
environment, it is likely to spread to 
different microbes in the environment 
previously lacking this ability (provided 
that transfer mechanisms are in action). The 
assumption that a gene has a vertical 
eukaryotic history is violated as soon as 
two eukaryotic lineages inhabit a similar 
environment and acquire their copy of 
a particular gene family independently 
during the adaptation process. The tradi- 
tional phylogenomic approaches will fail 
to identify members of such protein 
families as gene transfer candidates because 
they assume that vertical inheritance is 
the norm for all protein families with gene 
transfer events as very rare exceptions. 
However, this is probably only the case for 
universal core genes, and certainly not 
for patchily distributed proteins (Fig. 1).^ 
Traditional phylogenomic approaches 
have probably only scratched the surface 
of the gene transfer events and thereby 
drastically underestimated the impact 
of the process on eukaryotic genome 
evolution. 